55 research outputs found

    Adversarial Machine Learning-Based Anticipation of Threats Against Vehicle-to-Microgrid Services

    In this paper, we study the expanding attack surface of Adversarial Machine Learning (AML) and the potential attacks against Vehicle-to-Microgrid (V2M) services. We present an anticipatory study of a multi-stage gray-box attack that can achieve a result comparable to that of a white-box attack. Adversaries aim to deceive the targeted Machine Learning (ML) classifier at the network edge into misclassifying the incoming energy requests from microgrids. With an inference attack, an adversary can collect real-time data from the communication between smart microgrids and a 5G gNodeB to train a surrogate (i.e., shadow) model of the targeted classifier at the edge. To anticipate the impact of an adversary's capability to collect real-time data instances, we study five different cases, each representing a different amount of real-time data instances collected by an adversary. Out of six ML models trained on the complete dataset, K-Nearest Neighbour (K-NN) is selected as the surrogate model, and through simulations, we demonstrate that the multi-stage gray-box attack is able to mislead the ML classifier and cause an Evasion Increase Rate (EIR) of up to 73.2% using 40% less data than a white-box attack needs to achieve a similar EIR.
    Comment: IEEE Global Communications Conference (Globecom), 2022, 6 pages, 2 Figures, 4 Tables

    A Comparative Study of AI-Based Intrusion Detection Techniques in Critical Infrastructures

    Volunteer computing, in which the owners of Internet-connected devices (laptops, PCs, smart devices, etc.) volunteer them as storage and computing resources, has become an essential mechanism for resource management in numerous applications. The growth in the volume and variety of data traffic on the Internet raises concerns about the robustness of cyber-physical systems, especially for critical infrastructures. Therefore, the implementation of an efficient Intrusion Detection System for gathering such sensory data has gained vital importance. In this article, we present a comparative study of Artificial Intelligence (AI)-driven intrusion detection systems for wirelessly connected sensors that track crucial applications. Specifically, we present an in-depth analysis of the use of machine learning, deep learning and reinforcement learning solutions to recognise intrusive behavior in the collected traffic. We evaluate the proposed mechanisms by using KDD'99 as a real attack dataset in our simulations. Results present the performance metrics for three different IDSs, namely the Adaptively Supervised and Clustered Hybrid IDS (ASCH-IDS), Restricted Boltzmann Machine-based Clustered IDS (RBC-IDS), and Q-learning based IDS (Q-IDS), to detect malicious behaviors. We also present the performance of different reinforcement learning techniques such as State-Action-Reward-State-Action Learning (SARSA) and the Temporal Difference learning (TD). Through simulations, we show that Q-IDS performs with detection rate while SARSA-IDS and TD-IDS perform at the order of

    A Novel Ensemble Method for Advanced Intrusion Detection in Wireless Sensor Networks

    © 2020 IEEE. With the increase of cyber attack risks on critical infrastructures monitored by networked systems, robust Intrusion Detection Systems (IDSs) for protecting the information have become vital. Designing an IDS that achieves maximum accuracy with minimum false alarms is a challenging task. The ensemble method, considered one of the main developments in machine learning in the past decade, finds an accurate classifier by combining many classifiers. In this paper, an ensemble classification procedure is proposed using Random Forest (RF), Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Restricted Boltzmann Machine (RBM) as base classifiers. RF, DBSCAN, and RBM techniques have been used for classification purposes. The ensemble model is introduced for achieving better results. Bayesian Combination Classification (BCC) has been adopted as a combination technique. Independent BCC (IBCC) and Dependent BCC (DBCC) have been tested for performance comparison. The model shows a promising result for all classes of attacks. DBCC outperforms IBCC in terms of accuracy and detection rates. Through simulations under a wireless sensor network scenario, we have verified that the DBCC-based IDS works with ≈100% detection rate and ≈1.0 accuracy in the presence of intrusive behavior in the tested Wireless Sensor Network (WSN).

    Anchor-Assisted and Vote-Based Trustworthiness Assurance in Smart City Crowdsensing

    Smart city sensing calls for crowdsensing via mobile devices that are equipped with various built-in sensors. As incentivizing users to participate in distributed sensing is still an open research issue, the trustworthiness of crowdsensed data is expected to be a grand challenge if this cloud-inspired recruitment of sensing services is to be adopted. Recent research proposes reputation-based user recruitment models for crowdsensing; however, there is no standard way of identifying adversaries in smart city crowdsensing. This paper adopts previously proposed vote-based approaches, and presents a thorough performance study of vote-based trustworthiness with trusted entities that are a subset of the participating smartphone users. Those entities are called trustworthy anchors of the crowdsensing system. Thus, an anchor user is fully trustworthy and is fully capable of voting for the trustworthiness of other users who participate in sensing the same set of phenomena. Besides the anchors, the reputations of regular users are determined based on vote-based (distributed) reputation. We present a detailed performance study of the anchor-based trustworthiness assurance in smart city crowdsensing through simulations, and compare it with the purely vote-based trustworthiness approach without anchors, and a reputation-unaware crowdsensing approach, where user reputations are discarded. Through simulation findings, we aim at providing specifications regarding the impact of anchor and adversary populations on crowdsensing and user utilities under various environmental settings. We show that significant improvement can be achieved in terms of usefulness and trustworthiness of the crowdsensed data if the size of the anchor population is set properly.

    On Cropped versus Uncropped Training Sets in Tabular Structure Detection

    Automated document processing for tabular information extraction is highly desired in many organizations, from industry to government. Prior works have addressed this problem under table detection and table structure detection tasks. Proposed solutions leveraging deep learning approaches have been giving promising results in these tasks. However, the impact of dataset structures on table structure detection has not been investigated. In this study, we provide a comparison of table structure detection performance with cropped and uncropped datasets. The cropped set consists of only table images that are cropped from documents assuming tables are detected perfectly. The uncropped set consists of regular document images. Experiments show that deep learning models can improve the detection performance by up to 9% in average precision and average recall on the cropped versions. Furthermore, the impact of cropped images is negligible under the Intersection over Union (IoU) values of 50%-70% when compared to the uncropped versions. However, beyond 70% IoU thresholds, cropped datasets provide significantly higher detection performance
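
    To make the IoU thresholds in this abstract concrete, here is a minimal sketch of how a predicted table box is typically judged against a ground-truth box at a given IoU threshold. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the example coordinates are assumptions for illustration, not taken from the paper.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box; degenerate boxes have zero area."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, gt: Box, threshold: float) -> bool:
    """A prediction counts as correct when its IoU reaches the threshold."""
    return iou(pred, gt) >= threshold


# Hypothetical example: the prediction misses the bottom fifth of the table.
ground_truth: Box = (0, 0, 100, 100)
prediction: Box = (0, 0, 100, 80)

loose = is_match(prediction, ground_truth, 0.7)   # accepted at a 70% threshold
strict = is_match(prediction, ground_truth, 0.9)  # rejected at a 90% threshold
```

    The same prediction passes at a 70% threshold and fails at 90%, which is why results reported at stricter thresholds are more sensitive to localization quality.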

    Multidomain transformer-based deep learning for early detection of network intrusion

    Timely response of Network Intrusion Detection Systems (NIDS) is constrained by the flow generation process, which requires the accumulation of network packets. This paper introduces Multivariate Time Series (MTS) early detection into NIDS to identify malicious flows prior to their arrival at target systems. With this in mind, we first propose a novel feature extractor, Time Series Network Flow Meter (TS-NFM), that represents network flow as MTS with explainable features, and a new benchmark dataset, called SCVIC-TS-2022, is created using TS-NFM and the meta-data of CICIDS2017. Additionally, a new deep learning-based early detection model called Multi-Domain Transformer (MDT) is proposed, which incorporates the frequency domain into Transformer. This work further proposes a Multi-Domain Multi-Head Attention (MD-MHA) mechanism to improve the ability of MDT to extract better features. Based on the experimental results, the proposed methodology improves the earliness of the conventional NIDS (i.e., percentage of packets that are used for classification) by 5×10^4 times and duration-based earliness (i.e., percentage of duration of the classified packets of a flow) by a factor of 60, resulting in an 84.1% macro F1 score (31% higher than Transformer) on SCVIC-TS-2022. Additionally, the proposed MDT outperforms the state-of-the-art early detection methods by 5% and 6% on ECG and Wafer datasets, respectively.
    Comment: 6 pages, 7 figures, 3 tables, IEEE Global Communications Conference (Globecom) 202
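
    The two earliness measures mentioned in this abstract can be sketched as follows, reading them literally from the parenthetical definitions. The function names, the timestamp representation, and the example flow are assumptions for illustration; the paper's exact formulation may differ.

```python
from typing import Sequence


def packet_earliness(packets_used: int, total_packets: int) -> float:
    """Percentage of a flow's packets consumed before classification.

    Lower is earlier. Assumes 1 <= packets_used <= total_packets.
    """
    if not 1 <= packets_used <= total_packets:
        raise ValueError("packets_used must be between 1 and total_packets")
    return 100.0 * packets_used / total_packets


def duration_earliness(timestamps: Sequence[float], packets_used: int) -> float:
    """Percentage of the flow duration spanned by the classified packets.

    `timestamps` holds the arrival time of every packet in the flow,
    in seconds and in ascending order (an assumed representation).
    """
    if not 1 <= packets_used <= len(timestamps):
        raise ValueError("packets_used must be between 1 and len(timestamps)")
    total = timestamps[-1] - timestamps[0]
    if total <= 0:
        return 100.0  # single-instant flow: the whole duration is observed
    elapsed = timestamps[packets_used - 1] - timestamps[0]
    return 100.0 * elapsed / total


# Hypothetical flow of five packets, classified after the first two.
arrival_times = [0.0, 0.1, 0.2, 1.0, 2.0]
by_packets = packet_earliness(2, len(arrival_times))
by_duration = duration_earliness(arrival_times, 2)
```

    In this made-up flow, classifying after two of five packets uses 40% of the packets but only about 5% of the flow duration, which illustrates why the two measures can improve by very different factors.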

    Table Detection for Visually Rich Document Images

    Table Detection (TD) is a fundamental task towards visually rich document understanding. Current studies usually formulate the TD problem as an object detection problem, then leverage Intersection over Union (IoU) based metrics to evaluate the model performance and IoU-based loss functions to optimize the model. TD applications usually require the prediction results to cover all the table contents and avoid information loss. However, IoU and IoU-based loss functions cannot directly reflect the degree of information loss for the prediction results. Therefore, we propose to decouple IoU into a ground truth coverage term and a prediction coverage term, in which the former can be used to measure the information loss of the prediction results. Besides, tables in documents are usually large, sparsely distributed, and have no overlaps because they are designed to summarize essential information and make it easy for human readers to read and interpret. Therefore, in this study, we use SparseR-CNN as the base model, and further improve the model by using Gaussian Noise Augmented Image Size region proposals and many-to-one label assignments. To demonstrate the effectiveness of the proposed method and compare it fairly with state-of-the-art methods, we conduct experiments and use IoU-based evaluation metrics to evaluate the model performance. The experimental results show that the proposed method can consistently outperform state-of-the-art methods under different IoU-based metrics on a variety of datasets. We conduct further experiments to show the superiority of the proposed decoupled IoU for TD applications by replacing the IoU-based loss functions and evaluation metrics with the proposed decoupled IoU counterparts. The experimental results show that our proposed decoupled IoU loss can encourage the model to alleviate information loss.
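
    One plausible reading of the two coverage terms is sketched below: ground truth coverage as the fraction of the ground-truth box that the prediction covers, and prediction coverage as the fraction of the predicted box that lies on the ground truth. These definitions, the function names, and the example boxes are assumptions; the abstract does not give the paper's exact formulas.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection(a: Box, b: Box) -> float:
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    return w * h


def iou(pred: Box, gt: Box) -> float:
    inter = intersection(pred, gt)
    union = area(pred) + area(gt) - inter
    return inter / union if union > 0 else 0.0


def gt_coverage(pred: Box, gt: Box) -> float:
    """Assumed definition: share of the ground truth covered by the prediction.

    A value below 1.0 means part of the table was cut off (information loss).
    """
    gt_area = area(gt)
    return intersection(pred, gt) / gt_area if gt_area > 0 else 0.0


def pred_coverage(pred: Box, gt: Box) -> float:
    """Assumed definition: share of the prediction that lies on the ground truth.

    A value below 1.0 means the prediction includes extra, non-table area.
    """
    pred_area = area(pred)
    return intersection(pred, gt) / pred_area if pred_area > 0 else 0.0


# Two hypothetical predictions with the same IoU but different failure modes.
table: Box = (0, 0, 100, 100)
too_tight: Box = (0, 0, 100, 80)   # cuts off the bottom of the table
too_loose: Box = (0, 0, 125, 100)  # covers the table plus extra margin
```

    Both example predictions have an IoU of 0.8 with the table, yet only `too_tight` loses table content: its ground truth coverage is 0.8, while `too_loose` keeps it at 1.0. This is the distinction that plain IoU cannot express.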

    Handling big tabular data of ICT supply chains: a multi-task, machine-interpretable approach

    Due to the characteristics of Information and Communications Technology (ICT) products, the critical information of ICT devices is often summarized in big tabular data shared across supply chains. Therefore, it is critical to automatically interpret tabular structures given the surging amount of electronic assets. To transform the tabular data in electronic documents into a machine-interpretable format and provide layout and semantic information for information extraction and interpretation, we define a Table Structure Recognition (TSR) task and a Table Cell Type Classification (CTC) task. We use a graph to represent complex table structures for the TSR task. Meanwhile, table cells are categorized into three groups based on their functional roles for the CTC task, namely Header, Attribute, and Data. Subsequently, we propose a multi-task model to solve the two defined tasks simultaneously by using features from both the text and image modalities. Our experimental results show that our proposed method can outperform state-of-the-art methods on ICDAR2013 and UNLV datasets.
    Comment: 6 pages, 7 tables, 4 figures, IEEE Global Communications Conference (Globecom), 202

    Quantifying User Reputation Scores, Data Trustworthiness, and User Incentives in Mobile Crowd-Sensing

    Ubiquity of mobile devices with rich sensory capabilities has given rise to the mobile crowd-sensing (MCS) concept, in which a central authority (the platform) and its participants (mobile users) work collaboratively to acquire sensory data over a wide geographic area. Recent research in MCS highlights the following facts: 1) a utility metric can be defined for both the platform and the users, quantifying the value received by either side; 2) incentivizing the users to participate is a non-trivial challenge; 3) correctness and truthfulness of the acquired data must be verified, because the users might provide incorrect or inaccurate data, whether due to malicious intent or malfunctioning devices; and 4) an intricate relationship exists among platform utility, user utility, user reputation, and data trustworthiness, suggesting a co-quantification of these inter-related metrics. In this paper, we study two existing approaches that quantify crowd-sensed data trustworthiness, based on statistical and vote-based user reputation scores. We introduce a new metric - collaborative reputation scores - to expand this definition. Our simulation results show that collaborative reputation scores can provide an effective alternative to the previously proposed metrics and are able to extend crowd sensing to applications that are driven by a centralized as well as decentralized control

    Collaborative Feature Maps of Networks and Hosts for AI-driven Intrusion Detection

    Intrusion Detection Systems (IDS) are critical security mechanisms that protect against a wide variety of network threats and malicious behaviors on networks or hosts. While both Network-based IDS (NIDS) and Host-based IDS (HIDS) have been widely investigated, this paper presents a Combined Intrusion Detection System (CIDS) that integrates network and host data in order to improve IDS performance. Due to the scarcity of datasets that include both network packet and host data, we present a novel CIDS dataset formation framework that can handle log files from a variety of operating systems and align log entities with network flows. A new CIDS dataset named SCVIC-CIDS-2021 is derived from the meta-data of the well-known benchmark dataset, CIC-IDS-2018, by utilizing the proposed framework. Furthermore, a transformer-based deep learning model named CIDS-Net is proposed that can take network flow and host features as inputs and outperform baseline models that rely on network flow features only. Experimental results evaluating the proposed CIDS-Net on the SCVIC-CIDS-2021 dataset support the hypothesis that combining host and flow features is beneficial, as the proposed CIDS-Net can improve the macro F1 score of baseline solutions by 6.36% (up to 99.89%).
    Comment: IEEE Global Communications Conference (Globecom), 2022, 6 pages, 3 figures, 4 tables
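
    For readers unfamiliar with the macro F1 score reported here, the following is a minimal sketch of the standard definition: the unweighted mean of the per-class F1 scores, so rare attack classes count as much as the benign majority. The labels and predictions are made up for illustration and are not from the paper's dataset.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    classes = set(y_true) | set(y_pred)
    if not classes:
        return 0.0
    scores = []
    for cls in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels: one benign flow is wrongly flagged as an attack.
true_labels = ["benign", "benign", "benign", "attack"]
predicted = ["benign", "benign", "attack", "attack"]
score = macro_f1(true_labels, predicted)
```

    In this toy case the benign class scores 0.8 and the attack class about 0.67, so the macro F1 is their plain average, roughly 0.73, regardless of how many samples each class has.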